Detecting Parts for Action Localization

Authors

Nicolas Chesneau

Grégory Rogez

Karteek Alahari

Cordelia Schmid

Abstract

In this paper, we propose a new framework for action localization that tracks people in videos and extracts full-body human tubes, i.e., spatio-temporal regions localizing actions, even in the case of occlusions or truncations. This is achieved by training a novel human part detector that scores visible parts while regressing full-body bounding boxes. The core of our method is a convolutional neural network which learns part proposals specific to certain body parts. These are then combined to detect people robustly in each frame. Our tracking algorithm connects the image detections temporally to extract full-body human tubes. We apply our new tube extraction method to the problem of human action localization on the popular JHMDB dataset and on the recent, challenging DALY (Daily Action Localization in YouTube) dataset, showing state-of-the-art results.
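
The abstract does not say how the tracking algorithm connects detections over time. As a rough illustration of the general idea of linking per-frame boxes into tubes, the sketch below uses a generic greedy intersection-over-union (IoU) linker. This is a swapped-in technique, not the authors' tracker; the names `iou` and `link_tubes` and the `min_iou` threshold are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format
Tube = List[Tuple[int, Box]]             # list of (frame index, box) pairs


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubes(frames: List[List[Box]], min_iou: float = 0.3) -> List[Tube]:
    """Greedily link per-frame boxes into tubes (illustrative only).

    A tube is extended with the best-overlapping unmatched box of the next
    frame if their IoU reaches min_iou; otherwise the tube ends. Boxes left
    unmatched start new tubes.
    """
    tubes: List[Tube] = []   # every tube created so far
    active: List[Tube] = []  # tubes that were extended in the previous frame
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        next_active: List[Tube] = []
        for tube in active:
            if not unmatched:
                break  # no boxes left, remaining active tubes end here
            last_box = tube[-1][1]
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= min_iou:
                tube.append((t, best))
                unmatched.remove(best)
                next_active.append(tube)
        for box in unmatched:
            tube = [(t, box)]
            tubes.append(tube)
            next_active.append(tube)
        active = next_active
    return tubes


# Toy input: one person visible in all three frames, another missed in frame 1.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 1, 11, 11)],
    [(2, 2, 12, 12), (51, 51, 61, 61)],
]
tubes = link_tubes(frames)
```

In this toy input the second person is missed in the middle frame, so the greedy linker splits that track into two short tubes. Bridging such gaps is the kind of failure that full-body regression from visible parts is meant to reduce, according to the abstract.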

Similar resources

Combined model for detecting, localizing, interpreting and recognizing faces

This work describes a method that combines face detection, localization, part interpretation and recognition, and which is capable of learning from very limited data, in a semi-supervised or even fully unsupervised manner. Current state-of-the-art techniques for face detection and recognition are subject to two major limitations: extensive training requirement, often demanding tens of thousands...

Feasibility of detecting and localizing radioactive source using image processing and computational geometry algorithms

We consider the problem of finding the localization of radioactive source by using data from a digital camera. In other words, the camera could help us to detect the direction of radioactive rays radiation. Therefore, the outcome could be used to command a robot to move toward the true direction to achieve the source. The process of camera data is performed by using image processing and computa...

Invertible chaotic fragile watermarking for robust image authentication

Fragile watermarking is a popular method for image authentication. In such schemes, a fragile signal that is sensitive to manipulations is embedded in the image, so that it becomes undetectable after any modification of the original work. Most algorithms focus either on the ability to retrieve the original work after watermark detection (invertibility) or on detecting which image parts have ∗Th...

Compositional Structure Learning for Action Understanding

The focus of the action understanding literature has predominately been classification, however, there are many applications demanding richer action understanding such as mobile robotics and video search, with solutions to classification, localization and detection. In this paper, we propose a compositional model that leverages a new mid-level representation called compositional trajectories an...

Human Focused Action Localization in Video

We propose a novel human-centric approach to detect and localize human actions in challenging video data, such as Hollywood movies. Our goal is to localize actions in time through the video and spatially in each frame. We achieve this by first obtaining generic spatiotemporal human tracks and then detecting specific actions within these using a sliding window classifier. We make the following c...

Journal title:

CoRR

Volume: abs/1707.06005

Pages: -

Publication date: 2017
